Likelihood-Maximizing-Based Multiband Spectral Subtraction for Robust Speech Recognition

Authors

  • Bagher BabaAli
  • Hossein Sameti
  • Mehran Safayani
Abstract

Automatic speech recognition performance degrades significantly when speech is affected by environmental noise. The major challenge today is achieving robustness in adverse noise conditions so that automatic speech recognizers can be used in real-world settings. Spectral subtraction (SS) is a well-known and effective approach; it was originally designed to improve the quality of the speech signal as judged by human listeners. SS techniques usually improve the quality and intelligibility of the speech signal, whereas speech recognition systems need compensation techniques that reduce the mismatch between noisy speech features and acoustic models trained on clean speech. Nevertheless, a correlation can be expected between speech quality improvement and the increase in recognition accuracy. This paper proposes a novel approach to this problem that treats SS and the speech recognizer not as two independent entities cascaded together, but as two interconnected components of a single system that share the common goal of improved speech recognition accuracy. The approach feeds information from the statistical models of the recognition engine back to tune the SS parameters. With this architecture, we overcome the drawbacks of previously proposed methods and achieve better recognition accuracy. Experimental evaluations show that the proposed method achieves significant improvements in recognition rate across a wide range of signal-to-noise ratios.
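The idea of tuning SS parameters with feedback from the recognizer's statistical models can be illustrated with a toy sketch. The abstract does not give the authors' algorithm, so everything below is an assumption made for illustration: the function names, the per-band over-subtraction factors `alphas`, the spectral `floor`, the band log-energy features, the single diagonal Gaussian standing in for the acoustic model, and the exhaustive grid search standing in for whatever optimizer the paper uses.

```python
import math
from itertools import product


def multiband_subtract(power, noise, alphas, bands, floor=0.01):
    """Power spectral subtraction with one over-subtraction factor per band.

    power, noise: per-bin power spectra (lists of floats) for one frame.
    bands: list of (start, stop) bin ranges; alphas: one factor per band.
    """
    out = list(power)
    for (start, stop), alpha in zip(bands, alphas):
        for k in range(start, stop):
            cleaned = power[k] - alpha * noise[k]
            # A spectral floor keeps the result strictly positive.
            out[k] = max(cleaned, floor * power[k])
    return out


def band_log_energies(power, bands):
    """Toy feature vector: log of the summed power in each band."""
    return [math.log(sum(power[start:stop])) for start, stop in bands]


def gaussian_loglik(x, mean, var):
    """Log-likelihood of x under a diagonal Gaussian (stand-in acoustic model)."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def tune_alphas(frames, noise, bands, mean, var,
                grid=(0.0, 0.5, 1.0, 1.5, 2.0)):
    """Grid-search the per-band factors that maximize the model likelihood."""
    best_alphas, best_ll = None, -math.inf
    for alphas in product(grid, repeat=len(bands)):
        ll = 0.0
        for frame in frames:
            enhanced = multiband_subtract(frame, noise, alphas, bands)
            ll += gaussian_loglik(band_log_energies(enhanced, bands), mean, var)
        if ll > best_ll:
            best_alphas, best_ll = alphas, ll
    return best_alphas, best_ll


# Toy data (hypothetical): additive noise in the power domain, two bands.
bands = [(0, 2), (2, 4)]
clean = [4.0, 4.0, 1.0, 1.0]
noise = [1.0, 1.0, 2.0, 2.0]
noisy = [c + n for c, n in zip(clean, noise)]

# The "clean model" is centered on the clean features, with unit variance.
model_mean = band_log_energies(clean, bands)
model_var = [1.0, 1.0]

best_alphas, best_ll = tune_alphas([noisy], noise, bands, model_mean, model_var)
print(best_alphas, best_ll)
```

In this sketch the subtraction parameters are chosen by how well the enhanced features fit the model rather than by a signal-quality measure, which is the coupling the abstract describes. A real system would score against the recognizer's full acoustic model and would need a more scalable search than a grid.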

Similar resources

Spectral subtraction in likelihood-maximizing framework for robust speech recognition

Spectral subtraction (SS), as a speech enhancement technique, was originally designed to improve the quality of the speech signal as judged by human listeners. It usually improves the quality and intelligibility of speech signals, while speech recognition systems need compensation techniques capable of reducing the mismatch between the noisy speech features and the clean models. This paper proposes a no...

Improving the performance of MFCC for Persian robust speech recognition

Mel-frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing step which applies to t...

Q-Gaussian based spectral subtraction for robust speech recognition

Spectral subtraction (SS) is derived using maximum likelihood estimation, assuming that both noise and speech follow Gaussian distributions and are independent of each other. Under this assumption, noisy speech (speech contaminated by noise) also follows a Gaussian distribution. However, it is well known that noisy speech observed in real situations often follows a heavy-tailed distribution, not a G...

Robust Speech Recognition Using Speech Enhancement

Automatic Speech Recognition (ASR) has matured into a technology that is becoming more common in our everyday lives and is emerging as a necessity for minimising driver distraction when operating in-car systems such as navigation and infotainment. In “noise-free” environments, word recognition performance of these systems has been shown to approach 100%; however, this performance degrades rapidly...

Combined Spectral Subtraction and Cepstral Normalisation for Robust Speech Recognition

This paper presents an effective feature processing algorithm for robust speech recognition, based on combined spectral and cepstral processing. The spectral processing consists of Full-Wave Rectification Spectral Subtraction (FWR-SS) and Likelihood Controlled Instantaneous Noise Estimation (LCINE), while the cepstral processing is based on mean and variance normalisation. The combination is motiv...

Journal title:
  • EURASIP J. Adv. Sig. Proc.

Volume 2009

Publication date: 2009